Maps

  1. Create maps of county-level obesity rate estimates for adults living in the contiguous United States using BRFSS data from 2003 and 2013. These estimates have already been age-adjusted using Census population estimates to allow for comparison between counties and across time.
    • Read in BRFSS obesity data and county polygons, naming them obesity and counties, respectively. Use the base plot function to check that counties includes the polygon elements you expect. Hint: reading in an RDS file from a website requires that you run the file through a decompressor before loading it via readRDS. R has a built-in decomopressor function called gzcon. (2 points)
obesity <- read.csv("https://raw.githubusercontent.com/HimesGroup/BMIN503/master/DataFiles/county_obesity_prevalence.csv", header = TRUE)

counties <- readRDS(gzcon(url("https://raw.githubusercontent.com/HimesGroup/BMIN503/master/DataFiles/uscounties_2010.rds")))

plot(st_geometry(counties))

+ What were the 2004 and 2014 obesity rates for Orange County, California? For Orange County, Texas? Show all variables associated with these counties in the BRFSS and county polygons datasets. Aside from county names, what identifiers do these datasets share? *(2 points)*
obesity[obesity$county == "Orange County" & obesity$state == "California",]
##          state fips.code        county age.adjusted.percent.2004
## 219 California      6059 Orange County                      18.4
##     age.adjusted.percent.2014
## 219                      19.4
counties[counties$STATE == "6" & counties$COUNTY == "059", ]
## Simple feature collection with 1 feature and 6 fields
## geometry type:  MULTIPOLYGON
## dimension:      XY
## bbox:           xmin: -118.1167 ymin: 33.38663 xmax: -117.413 ymax: 33.94641
## epsg (SRID):    4326
## proj4string:    +proj=longlat +datum=WGS84 +no_defs
##             GEO_ID STATE COUNTY   NAME   LSAD CENSUSAREA
## 683 0500000US06059     6    059 Orange County    790.568
##                           geometry
## 683 MULTIPOLYGON (((-118.093 33...
obesity[obesity$county == "Orange County" & obesity$state == "Texas",]
##      state fips.code        county age.adjusted.percent.2004
## 2707 Texas     48361 Orange County                      25.3
##      age.adjusted.percent.2014
## 2707                      29.2
counties[counties$STATE == "48" & counties$COUNTY == "361", ]
## Simple feature collection with 1 feature and 6 fields
## geometry type:  MULTIPOLYGON
## dimension:      XY
## bbox:           xmin: -94.1176 ymin: 29.91442 xmax: -93.69525 ymax: 30.24432
## epsg (SRID):    4326
## proj4string:    +proj=longlat +datum=WGS84 +no_defs
##              GEO_ID STATE COUNTY   NAME   LSAD CENSUSAREA
## 1194 0500000US48361    48    361 Orange County    333.669
##                            geometry
## 1194 MULTIPOLYGON (((-93.90145 3...

The obesity rate in Orange County, California was 18.4% in 2004 and 19.4% in 2014. The obesity rate in Orange County, Texas was 25.3% in 2004 and 29.2% in 2014. The fips.code variable in the BRFSS dataset match the STATE and COUNTY ID’s in the county polygons dataset.

+ Merge the two datasets so that `counties` contains state names and obesity rates for 2004 and 2014. *(3 points)*
library(tidyverse)

counties$fips.code <- paste0(counties$STATE, counties$COUNTY)
obesity$fips.code <- as.character(obesity$fips.code)

counties <- inner_join(counties, obesity, by = "fips.code")
head(counties)
## Simple feature collection with 6 features and 11 fields
## geometry type:  MULTIPOLYGON
## dimension:      XY
## bbox:           xmin: -88.13999 ymin: 31.19518 xmax: -85.12342 ymax: 34.8922
## epsg (SRID):    4326
## proj4string:    +proj=longlat +datum=WGS84 +no_defs
##           GEO_ID STATE COUNTY     NAME   LSAD CENSUSAREA fips.code   state
## 1 0500000US01001     1    001  Autauga County    594.436      1001 Alabama
## 2 0500000US01009     1    009   Blount County    644.776      1009 Alabama
## 3 0500000US01017     1    017 Chambers County    596.531      1017 Alabama
## 4 0500000US01021     1    021  Chilton County    692.854      1021 Alabama
## 5 0500000US01033     1    033  Colbert County    592.619      1033 Alabama
## 6 0500000US01045     1    045     Dale County    561.150      1045 Alabama
##            county age.adjusted.percent.2004 age.adjusted.percent.2014
## 1  Autauga County                      29.8                      33.8
## 2   Blount County                      25.4                      34.9
## 3 Chambers County                      30.5                        40
## 4  Chilton County                      27.6                        36
## 5  Colbert County                      27.4                      37.8
## 6     Dale County                      28.8                      36.6
##                         geometry
## 1 MULTIPOLYGON (((-86.49677 3...
## 2 MULTIPOLYGON (((-86.5778 33...
## 3 MULTIPOLYGON (((-85.18413 3...
## 4 MULTIPOLYGON (((-86.51734 3...
## 5 MULTIPOLYGON (((-88.13999 3...
## 6 MULTIPOLYGON (((-85.41644 3...
+ For each year (i.e., 2004 and 2014), create a static choropleth map of county-level obesity rates for the US using _ggplot2_. Add a title with `ggtitle`, remove county borders with `lwd=0` in the `geom_sf` call, and incorporate custom theme elements with the user-created `my_theme()` function. Some code to get you started with these maps is offered below. Feel free to change plot aesthetics or choose a different color palette. *(4 points)*
counties <- counties %>%
  mutate(age.adjusted.percent.2004 = as.numeric(as.character(age.adjusted.percent.2004)),
         age.adjusted.percent.2014 = as.numeric(as.character(age.adjusted.percent.2014)))

# Use a fixed color scale to more easily compare obesity rates between maps 
prev_min <- min(c(counties$age.adjusted.percent.2004, counties$age.adjusted.percent.2014), na.rm = TRUE)
prev_max <- max(c(counties$age.adjusted.percent.2004, counties$age.adjusted.percent.2014), na.rm = TRUE)

library(RColorBrewer)

my_theme <- function() {
  theme_minimal() +                                  
  theme(axis.line = element_blank(),                 
        axis.text = element_blank(),                 
        axis.title = element_blank(),
        panel.grid = element_line(color = "white"),  
        legend.key.size = unit(0.8, "cm"),          
        legend.text = element_text(size = 16),       
        legend.title = element_text(size = 16),
        plot.title = element_text(size = 22))      
}

myPalette <- colorRampPalette(brewer.pal(9, "YlOrRd"))

ggplot() +
  geom_sf(data = counties, aes(fill = age.adjusted.percent.2004), lwd = 0) +
  my_theme() +
  ggtitle("County-level obesity rates in 2004") + 
  scale_fill_gradientn(name = "Obesity rate (%)", colours = myPalette(100),
                       limit = range(prev_min, prev_max)) 

ggplot() +
  geom_sf(data = counties, aes(fill = age.adjusted.percent.2014), lwd = 0) +
  my_theme() +
  ggtitle("County-level obesity rates in 2014") + 
  scale_fill_gradientn(name = "Obesity rate (%)", colours = myPalette(100),
                       limit = range(prev_min, prev_max)) 

+ How did obesity rates in adults change between 2004 and 2014? (Qualitative answer is sufficient!) *(2 points)*    

Obesity rates rose substantially in most parts of the country, with many counties having a percentage point increase of approximately 10.

Interactive Maps

  1. Create an interactive map to visualize the change in adult obesity rates for all counties in the contiguous United States between 2004 and 2014.
    • Create a new variable in counties that tracks the change in obesity rate for each county between 2004 and 2014. Be sure to code this variable so that a positive value indicates an increase in the prevalence of obesity. (1 point)
counties$prevchange <- counties$age.adjusted.percent.2014 - counties$age.adjusted.percent.2004
pal_fun <- colorBin(palette = brewer.pal(9, "RdBu")[c(1:5, 7)], 
                    bins = c(-3, -1, 1, 5, 9, 13, 17), reverse = TRUE,
                    NULL)

pu_message <- paste0(counties$county, ", ", counties$state,
                     "<br>Change in obesity rate (2004-2014): ",
                     round(counties$prevchange, 1), "%")

leaflet(counties) %>%
  addPolygons(stroke = FALSE,                        
              fillColor = ~pal_fun(prevchange),
              fillOpacity = 0.8, smoothFactor = 0.5, 
              popup = pu_message) %>%
  addProviderTiles(providers$CartoDB.Positron) %>%   
  addLegend("bottomright",                           
            pal=pal_fun,                             
            values=~prevchange,                     
            title = 'Change in obesity rate 2004-2014 (%)',             
            opacity = 1)  %>%                       
  addScaleBar()
  1. Create a choropleth map of a county-aggregated variable of your choice from the American Community Survey (ACS) 5-year estimates for 2012-2016.
vars <- load_variables(dataset = "acs5", year = 2016)
# Obtaining information about median house value
acs.data <- get_acs(geography = "county", variables = "B25077_001", year = 2016, survey = "acs5")

acs.data <- acs.data %>%
  select(GEOID, NAME, estimate) %>%
  rename(median_houseval = estimate)

head(acs.data)
## # A tibble: 6 x 3
##   GEOID NAME                    median_houseval
##   <chr> <chr>                             <dbl>
## 1 01001 Autauga County, Alabama          141000
## 2 01003 Baldwin County, Alabama          173400
## 3 01005 Barbour County, Alabama           90300
## 4 01007 Bibb County, Alabama              97200
## 5 01009 Blount County, Alabama           124200
## 6 01011 Bullock County, Alabama           68000
acs.data <- acs.data %>%
  mutate(fips.code = as.character(as.numeric(as.character(GEOID))))

counties <- inner_join(counties, acs.data, by = "fips.code")

head(counties)
## Simple feature collection with 6 features and 15 fields
## geometry type:  MULTIPOLYGON
## dimension:      XY
## bbox:           xmin: -88.13999 ymin: 31.19518 xmax: -85.12342 ymax: 34.8922
## epsg (SRID):    4326
## proj4string:    +proj=longlat +datum=WGS84 +no_defs
##           GEO_ID STATE COUNTY   NAME.x   LSAD CENSUSAREA fips.code   state
## 1 0500000US01001     1    001  Autauga County    594.436      1001 Alabama
## 2 0500000US01009     1    009   Blount County    644.776      1009 Alabama
## 3 0500000US01017     1    017 Chambers County    596.531      1017 Alabama
## 4 0500000US01021     1    021  Chilton County    692.854      1021 Alabama
## 5 0500000US01033     1    033  Colbert County    592.619      1033 Alabama
## 6 0500000US01045     1    045     Dale County    561.150      1045 Alabama
##            county age.adjusted.percent.2004 age.adjusted.percent.2014
## 1  Autauga County                      29.8                      33.8
## 2   Blount County                      25.4                      34.9
## 3 Chambers County                      30.5                      40.0
## 4  Chilton County                      27.6                      36.0
## 5  Colbert County                      27.4                      37.8
## 6     Dale County                      28.8                      36.6
##   prevchange GEOID                   NAME.y median_houseval
## 1        4.0 01001  Autauga County, Alabama          141000
## 2        9.5 01009   Blount County, Alabama          124200
## 3        9.5 01017 Chambers County, Alabama           83000
## 4        8.4 01021  Chilton County, Alabama           99100
## 5       10.4 01033  Colbert County, Alabama          102500
## 6        7.8 01045     Dale County, Alabama          103800
##                         geometry
## 1 MULTIPOLYGON (((-86.49677 3...
## 2 MULTIPOLYGON (((-86.5778 33...
## 3 MULTIPOLYGON (((-85.18413 3...
## 4 MULTIPOLYGON (((-86.51734 3...
## 5 MULTIPOLYGON (((-88.13999 3...
## 6 MULTIPOLYGON (((-85.41644 3...
library(leaflet)

pal_fun <- colorNumeric("BuPu", NULL)       

pu_message <- paste0(counties$county, ", ", counties$state,
                     "<br>Median House Value: ",
                     "$", counties$median_houseval)

leaflet(counties) %>%
  addPolygons(stroke = FALSE,                       
              fillColor = ~pal_fun(median_houseval),
              fillOpacity = 0.8, smoothFactor = 0.5,
              popup = pu_message) %>%
  addProviderTiles(providers$CartoDB.Positron) %>%   
  addLegend("bottomright",                           
            pal=pal_fun,                             
            values=~median_houseval,                      
            title = 'Median House Value ($)',              
            opacity = 1)  %>%                        
  addScaleBar()

Alternatively, mapping percent poverty…

acs.data2 <- get_acs(geography = "county", variables = c("B17010_001", "B17010_002"), year = 2016, survey = "acs5")

acs.data2 <- acs.data2 %>%
  group_by(GEOID, NAME) %>%
  summarise(num_poverty = estimate[variable == "B17010_002"],
            num_total = estimate[variable == "B17010_001"]) %>%
  mutate(percent_poverty = round(num_poverty/num_total*100, 1), 
         fips.code = as.character(as.numeric(as.character(GEOID))))

counties2 <- inner_join(counties, acs.data2, by = "fips.code")

pal_fun <- colorNumeric("BuPu", NULL)       

pu_message <- paste0(counties2$county, ", ", counties2$state,
                     "<br>Poverty rate: ",
                     counties2$percent_poverty, "%")

leaflet(counties2) %>%
  addPolygons(stroke = FALSE,                       
              fillColor = ~pal_fun(percent_poverty),
              fillOpacity = 0.8, smoothFactor = 0.5,
              popup = pu_message) %>%
  addProviderTiles(providers$CartoDB.Positron) %>%   
  addLegend("bottomright",                           
            pal=pal_fun,                             
            values=~percent_poverty,                      
            title = 'Poverty rate (%)',              
            opacity = 1)  %>%                        
  addScaleBar()

Median house value: House values are highest along the coast of the California Bay Area.

Percent poverty: Poverty rates are highest in many counties of Louisiana and Missippi, as well as southern South Dakota and the southern tip of Texas.